This assignment is for ETC5521 Assignment 1 by Team taipan comprising of Helen Evangelina and Yiwen Jiang.

1 Introduction and motivation

This report presents the findings of the woodland caribou between 1988 to 2016 following the tracking data conducted under B.C. Ministry of Environment & Climate Change. This report mainly analyses the changes in the number of woodland caribou, and other analyses include the habitats changes caused by seasonal differences, the effects of the implementation of management plans and the causes of tag deployment ended. In the following section, we will describe the data set, where the data came from, and what is the data prepared for. The data description also includes how we transform and clean the raw data for analysis. Our statistical programming used for analysis is R and Rstudio.

1.1 Motivation

Caribou are the only large herbivore that is widely distributed in the high-elevation habitat and act as agents for plant and lichen diversity through the mechanisms of trampling and foraging. The Caribou has also been a significant resource for indigenous peoples for millennia (Environment (2014)). The survival rate of the Caribou is generally relatively low due to predation by Canis Lupus (wolf). The Caribou listed as “vulnerable” on the International Union for Conservation of Nature (IUCN) Red List. With the Caribou being listed as “Threatened”, it is essential to monitor the number of the Caribou as monitoring is vital to effective conservation. We will represent our findings in this report through the exploration of the Caribou tracking data.

1.2 Data Sources

The tracking data was collected by B.C. Ministry of Environment & Climate Change over 28 years (1988 - 2016), the data was prepared for the study of management and recovery of the caribou (BC Ministry of Environment (2014)). It includes the information of 286 Caribou and covered 250,000 locations.

1.3 Data Limitation

Visualise the missing value in individuals data

Figure 1.1: Visualise the missing value in individuals data

  • There are several limitations associated with this dataset. The most noticeable thing about this dataset is that there are a lot of NAs which causes limited analysis. We observe that in the individuals data over half of the values are missing (Refer to Figure 1.1). Not many analysis can be performed because by removing the NAs, there would not be sufficient data left to be analysed and insufficient data would lead to inaccurate result. For example, in the pregnant variable, there are 93.36% of the values missing.

  • The Caribou has a low reproductive rate due to females only have one calf per year, and females do not reproduce until they are two years old. To analysis, the sex ratio should be a good indicator of the trend of the number of Caribou. However, there are only five males Caribou out of 286. The analysis result will exist bias when we use the sex ratio as an indicator.

  • Another limitation is deploy_off_type mainly consists of “unknown”, indicating improper records. This would not lead to accurate analysis. We wanted to see if the equipment failure on the deploy_off_type is many to see if the equipment are working properly or not - and if not, then the quality of the equipment should be improved. However, it turned out that there isn’t any equipment failure value in the deploy_off_type. Same thing with the death_cause which consists many “unknown” values. Other than that, there is inconsistent naming of the values inside the dead_cause variable.

2 Data description

2.1 Overview of the dataset

The dataset tracks woodland caribou in northern British Columbia, published by the Movebank Data Repository at _https://www.datarepository.movebank.org/handle/10255/move.955_. This data was collected by putting trackers of almost 250,000 location tags on 286 Caribou, from 1988 to 2016, which was accessed through Movebank.

The boreal woodland caribou, also known as woodland caribou, boreal forest caribou and forest-dwelling caribou, is a North American subspecies of the reindeer with the vast majority of animals in Canada. They prefer lichen-rich mature forests and mainly live in marshes, bogs, lakes and river regions. Caribou are considered as an ancient member of the deer family Cervidae (Banfield (1974)). They are smaller than Moose (Alces americanus) and Elk (Cervus canadensis), standing 1.0 - 1.2 meters high at the shoulder (Thomas (2002)). Due to the caribou is classified as “Vulnerable” on the International Union for the Conservation of Nature’s (IUCN) Red List. The data provided for the study of the B.C. Ministry of Environment & Climate Change to report the management and recovery of the caribou.

Because this data set is used for analysing the reproduction of species, the data is obtained by observation rather than experiment. There is no treatment group and the control group. The time frame of the collection was started in 1988 and the end of 2016. Movebank collected the locations data of individual animals over time by tracking the biologging sensors attached to animals (Kranstauber et al., 2011). The data sets were separated into two data files and provided by .csv format. The following are the variables in each data.

  • The individual data comes from Mountain caribou in British Columbia-reference-data.csv. The data contains the relevant information of 286 caribou. The variables are showing in the Table 2.1.
Table 2.1: Variables in individuals data
Variable Class Description
animal_id character Individual identifier for animal
sex character Sex of animal
life_stage character Age class (in years) at beginning of deployment
pregnant logical Whether animal was pregnant at beginning of deployment
with_calf logical Whether animal had a calf at time of deployment
death_cause character Cause of death
study_site character Deployment site or colony, or a location-related group such as the herd or pack name
deploy_on_longitude double Longitude where animal was released at beginning of deployment
deploy_on_latitude double Latitude where animal was released at beginning of deployment
deploy_on_comments character Additional information about tag deployment
deploy_off_longitude double Longitude where deployment ended
deploy_off_latitude double Latitude where deployment ended
deploy_off_type character Classification of tag deployment end (see table below for full description
deploy_off_comments character Additional information about tag deployment end
  • The locations comes from Mountain caribou in British Columbia-gps.csv. The data contains location information of each counted caribous for every four fours. The variables are showing in the Table 2.2.
Table 2.2: Variables in individuals data
Variable Class Description
event_id double Identifier for an individual measurement
animal_id character Individual identifier for animal
study_site character Deployment site or colony, or a location-related group such as the herd or pack name
season character Season (Summer/Winter) at time of measurement
timestamp datetime Date and time of measurement
longitude double Longitude of measurement
latitude double Latitude of measurement

2.2 Data cleaning processes

The data being used is the dataset from the Science update for the South Peace Northern Caribou (Rangifer tarandus caribou pop. 15) in British Columbia available from Movebank (BC Ministry of Environment, 2014). The raw datasets are first read by using read_csv() function. It can be noticed from the raw datasets that the variable names use “-” instead of "_“. Using dash in a variable name might result in issues, as the valid variable name in R should consist of dot or underline characters. Another problem from this dataset is the values in the “animal-life-stage” consist of spacing, which might lead to issues as it is inconsistent. Another noticeable thing is the datasets have a lot of NA values. Therefore, the data needs to be cleaned by using the tidyverse and janitor libraries.

To clean the individuals data, firstly clean_names() function from the janitor package is used to return the data.frame with clean names. What this function does is changing the variable names into a tidier form. As mentioned before, using dash in variable names is not appropriate in R. Notice that the raw dataset has names like “deploy-off-latitude” which is changed into “deploy_off_latitude”. Next is to assigned the result to transmute(), which will compute new columns but will drop existing columns. This is done to make the variable names in a tidier way. The whitespace in the life stage is gotten rid to address inconsistent spacing by using str_remove_all() function. After tidying the variable names with transmuting, the “reproductive_condition” variable is separated into “pregnant” and “with_calf” by using the separate() function as this variable actually contains two dimensions, and then assigning those variables into new columns by using the mutate() function which consists of either TRUE or FALSE value.

The locations data is cleaned by using the same method as the individuals data, which includes cleaning the name first by using clean_names() function to arrive at a data.frame with clean names. The next step is to use transmute() function to compute new columns with dropping existing columns. After cleaning both datasets, the final datasets are written into csv format by using write_csv() function.

# Load libraries
library(tidyverse)
library(janitor)

# Import data
individuals_raw <- read_csv("./caribou-location-tracking/raw/Mountain caribou in British Columbia-reference-data.csv")
locations_raw <- read_csv("./caribou-location-tracking/raw/Mountain caribou in British Columbia-gps.csv")

# Clean individuals
individuals <- individuals_raw %>%
  clean_names() %>%
  transmute(animal_id,
            sex = animal_sex,
            # Getting rid of whitespace to address inconsistent spacing
            # NOTE: life stage is as of the beginning of deployment
            life_stage = str_remove_all(animal_life_stage, " "),
            reproductive_condition = animal_reproductive_condition,
            # Cause of death "cod" is embedded in a comment field
            death_cause = str_remove(animal_death_comments, ".*cod "),
            study_site,
            deploy_on_longitude,
            deploy_on_latitude,
            # Renaming to maintain consistency "deploy_on_FIELD" and "deploy_off_FIELD"
            deploy_on_comments = deployment_comments,
            deploy_off_longitude,
            deploy_off_latitude,
            deploy_off_type = deployment_end_type,
            deploy_off_comments = deployment_end_comments) %>%
  # reproductive_condition actually has two dimensions
  separate(reproductive_condition, into = c("pregnant", "with_calf"), sep = ";", fill = "left") %>%
  mutate(pregnant = str_remove(pregnant, "pregnant: ?"),
         with_calf = str_remove(with_calf, "with calf: ?")) %>%
  # TRUE and FALSE are indicated by Yes/No or Y/N
  mutate_at(vars(pregnant:with_calf), ~ case_when(str_detect(., "Y") ~ TRUE,
                                                   str_detect(., "N") ~ FALSE,
                                                   TRUE ~ NA))

# Clean locations
locations <- locations_raw %>%
  clean_names() %>%
  transmute(event_id,
            animal_id = individual_local_identifier,
            study_site = comments,
            season = study_specific_measurement,
            timestamp,
            longitude = location_long,
            latitude = location_lat)

# Write to CSV
write_csv(individuals, "./caribou-location-tracking/individuals.csv")
write_csv(locations, "./caribou-location-tracking/locations.csv")

2.3 Possible questions

This dataset is primarily used to analyse the changes in the number of caribou from 1988 to 2016 to observe the survival of the species. As the management came up with a plan, we would like to analyse whether the management plan is effective in increasing the number of caribou over time.

The primary question to answer from this dataset is how is the trend of the number of caribou over time?

From the primary question, we came up with four secondary questions, which are as follows:
- Do the habitats vary between summer and winter?
- How is the trend of the classification of tag deployment end (deploy_off_type)?
- Has the management plan increased the number of caribou?

3 Analysis and findings

To overview the survival of Caribou in different herds, we will first look at the changing on the number of Caribou which have tracked. We use the location data to represent the changing on the number of caribou in each herd. The location data are tracking the location information of each caribou for every four hours; the tracking conducted until tag deployment ended.

Figure 3.1: Monthly number of Caribou been tracked between 1988 to 2016

As showing in Figure 3.1, the volatility of the number of Caribou tracked are high. The lowest number from 1988 to 2016 is almost 10, but it increased to over fifty by less than five years. The pattern is quite different between 1992 to 2001, that is because there are no tracking records in those years. After 2010, the number of Caribou is decreasing gradually.

Due to the limitations of this plot, some of the information does not present. For example, what causes the number of the Caribou tracked decreasing, does the equipment failure or death of Caribou. If the reason if equipment failure, there is not reasonable to conclude the number of Caribou changed. We will go further to explore findings from the tracking data of Caribou. That include, do the habitats vary between summer and winter? How is the trend of the classification of tag deployment end (deploy_off_type)? Has the management plan increased the number of caribou?

3.1 Question 1: Do the habitats vary between summer and winter?

Seasonal differences of habitats (Coloured by seasons)

Figure 3.2: Seasonal differences of habitats (Coloured by seasons)

The Caribou resides in the British Columbian within eight herd ranges: Hart Ranges, Graham, Moberly, Scott, Burnt Pine, Kennedy Siding, Quintette, and Narraway. Caribou’s habitats vary seasonally to obtain forage, cover, and avoidance from predators.

Typically, in winter, Caribou will select low-elevation forests or windswept alpine ridges where snow cover is relatively shallow to create and forage for terrestrial lichens. The habitats of different seasons have been coloured and showing in Figure @ref(fig:caribou_map), red and blue dots are the locations been recorded separately in summer and winter. Most of Caribou’s habitats overlapped between summer and winter. There still has some difference. In summer, the range of Caribou activities is more concentrated on the mountains, but in winter the range of Caribou activities is flatter.

Seasonal differences of habitats (Coloured by herds and separated by herds)

Figure 3.3: Seasonal differences of habitats (Coloured by herds and separated by herds)

Then we separate the habitats with seasonality and coloured by the herds; it is present in 3.3. We can observe that some of the locations are the difference between seasons. Compared to summer, Caribou in the same herd are more concentrated in winter. In summer, most of the Caribou migrate towards the central core of the Rocky Mountains (high-elevation habitats) because wolves live primarily at low-elevation that spatial separation can help to avoid predators. This plot is still hard to analysis the difference of habitats for each herd because the habitats of some herds overlapped. For example, the dots where the Burnt Pine herd inhabits covered by the Kennedy herd.

Seasonal differences of habitats (Separated by herds and seasons)

Figure 3.4: Seasonal differences of habitats (Separated by herds and seasons)

In Figure 3.4, the map has separated by seasons and herds. The red dot represents summer, and blue one represents winter. It presents the variety of habitats between seasons across herds. Burnt Pine, Kennedy, Narraway and Scott herds show the Caribou are getting much closer in winter comparing with summer. From Narraway we can observe that Caribou move from mountains to flatten area in winter more.

3.2 Question 2: How is the trend of the classification of tag deployment end (deploy_off_type)? What is the highest cause of dead?

In order to answer this question, we analyse the deploy_off_type variable by visualising it on bar charts.

Figure 3.5: Distribution of the classification of tag deployment end.

It can be seen from figure 3.5 that out of all the deploy_off_type stated in the data description, there are only four types available in the dataset, which are dead, removal, other and unknown. The highest number of type is unknown, with 99, which indicates that there was no proper tracking and documentation. The second highest type is removal, which means that the tag was purposefully removed from the animal. Following removal is dead with 60, which is still a lot. The least number of type is other, which might consist of other deployment end types.

As there are several data with dead as the deployment end type, we are looking at the cause of death of caribou in more details. It is important to look at the cause of death as understanding the cause of death of the caribou would allow further monitoring of the caribou to minimise their death. To analyse the cause of death, the individuals dataset is wrangled by filtering the deploy_off_type. As there are some NAs in the death_cause column, the NAs are removed.

And then a bar plot is created and reordered by the number of the death_cause occuring in the data.

Figure 3.6: Total number of death caused by each cause of death.

It can be seen from figure 3.6 that the highest number of cause of death is predation by wolf, which is 18 deaths, if the suspected predation ones are included. The other substantial part of the cause of death includes predation by bear or predation by unknown animals, which concludes that the majority of the death of the caribou was due to predation with around 30% of the total death. The other causes such as vehicle collision, train collision, and accidentals are only partaking a very small part of the death causes. However, there is also a significant amount of unknown which indicates that monitoring and documentation is not really good.

What we can learn from the analysis is that the majority cause of calf mortality is predation, primarily by wolf, and it is important to take this into consideration when the management are developing plans to increase the number of caribou. The seasonal habitat change that is compromising spatial separation between caribou and the predators can increase the risk of predation. Grey wolves normally live in low elevations, however they occasionally hunt at higher elevations where caribou live (BC Ministry of Environment (2014)). Another reason for the increasing encounter rate of caribou by wolves is the development of linear features such as pipelines and roads which facilitates rapid and widespread movements of wolves (H. W. McKenzie (2012)).

3.3 Question 3: Has the management plan increased the number of caribou?

According to the Peace Northern Caribou Plan endorsed in November 2012, the goal is to increase the South Peace Northern Caribou to 1,200 animals within 21 years, which has been articulated in implementation plans. Therefore, we are analysing whether the management plan has increased the number of caribou by looking at the number of caribou in the dataset.

Figure 3.7: The trend of the number of caribou per year.

Figure 3.7 presents the trend of the number of caribou yearly according to the locations dataset. It can be noticed that the number of caribou fluctuated overtime. It declined by a significant amount in 2001, however the number increased sharply again in 2002 and 2003, and declined again in 2006. The peak was at 2008, where the number of caribou reached 70, which is more than double the number of caribou in 2006. After 2008, the number decreased again. The management plan which was started to be implemented in 2012 seems to be ineffective as the number of caribou still ranges at around 20-30 caribou.

Next, we are looking at the number of caribou’s trend yearly in more details by looking at the trend per study_site.

Figure 3.8: Trend of number of caribou per year according to the study site.

Figure 3.8 above is showing the changes in the number of caribou yearly based on the study site. The most noticeable thing here is that not all study sites are being recorded every year, and that the trend of number of caribou varies over the study site. From 1988 to 2001, the only study site being recorded was Hart Ranges, in where there was a huge drop between this period. Most of the other study sites started to be recorded since 2002, except for Scott which started in 2013. Most of the study sites are showing a sharp decline in the number of caribou in 2010, which explains the huge drop in the overall trend mentioned before. The number of caribou in Narraway is super low which is due to the herds that are living in this site tend to stay in the lower-elevation area during summer, thus are more exposed to predation risk. Looking at the number of caribou from 2012, there has not been any significant changes in the number of caribou. Therefore, it appears that the management plan has not increased the number of caribou.

So to answer whether the management plan has increased the number of caribou, it appears that the management plan has not increased the number of caribou. And it is unlikely that the number of caribou would increase by that much in 2021. However, this analysis might not be accurate as there are some limitations associated with this analysis. Firstly, some study sites were only being monitored for certain years, which results to inaccurate result to analyse the trend of number of caribou. Secondly, there might be some caribou that were not tracked in certain years as we are calculating the number of caribou per year by using the documented locations data. It might not be an accurate analysis as there might be some caribou that were not tracked in certain years, or that the locations were not being documented inside the locations dataset.

4 Conclusion

Through the above analysis, we can conclude that the population estimate of the Caribou decreased over a decade. As a result of the volatility, the long-term trend can not be forecasted, but in the short-term, the number of Caribou will decrease. The habitats of Caribou are varying between summer and winter. From winter to summer, they move from low-elevation to high-elevation for the breed and to avoid predators. The highest number of tag deployment end classification is unknown which indicates improper documentation, followed by removal and death. Predation, primarily by wolf, is the majority cause of calf mortality, which is due to seasonal habitat change and the development of roads and pipelines. The management plan seems to be ineffective as the number of caribou has not increased.

5 Acknowledgement

Thanks for tidytuesday provided dataset, relevant articles and cleaning process on GitHub (Hughes (2020)). The dataset used is Caribou tracking dataset offered by Movebank Data Repository (BC Ministry of Environment (2014)). Packages used are Wickham (2016), Wickham et al. (2019), Sievert (2020), Zhu (2019), Grolemund and Wickham (2011), Wickham et al. (2020), Tierney et al. (2020), Kahle and Wickham (2013), Yutani (2020) and Tierney (2017),

References

Banfield, A. W. F. 1974. “The Mammals of Canada. Univ. Toronto Press, Toronto, on.” 57: 419.

BC Ministry of Environment. 2014. Science Update for the South Peace Northern Caribou (Rangifer Tarandus Caribou Pop. 15) in British Columbia. https://www2.gov.bc.ca/assets/gov/environment/plants-animals-and-ecosystems/wildlife-wildlife-habitat/caribou/science_update_final_from_web_jan_2014.pdf.

Environment, BC Ministry of. 2014. “Science Update for the South Peace Northern Caribou (Rangifer Tarandus Caribou Pop. 15) in British Columbia.” https://www2.gov.bc.ca/assets/gov/environment/plants-animals-and-ecosystems/wildlife-wildlife-habitat/caribou/science_update_final_from_web_jan_2014.pdf.

Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times Made Easy with lubridate.” Journal of Statistical Software 40 (3): 1–25. http://www.jstatsoft.org/v40/i03/.

Hughes, Ellis. 2020. TidytuesdayR: Access the Weekly ’Tidytuesday’ Project Dataset. https://github.com/thebioengineer/tidytuesdayR.

H. W. McKenzie, R. J. Spiteri, E. H. Merrill. 2012. “How Linear Features Alter Predator Movement and the Functional Response.” Interface Focus 2(2): 205–16. https://doi.org/10.1098/rsfs.2011.0086.

Kahle, David, and Hadley Wickham. 2013. “Ggmap: Spatial Visualization with Ggplot2.” The R Journal 5 (1): 144–61. https://journal.r-project.org/archive/2013-1/kahle-wickham.pdf.

Sievert, Carson. 2020. Interactive Web-Based Data Visualization with R, Plotly, and Shiny. Chapman; Hall/CRC. https://plotly-r.com.

Thomas, D. R. Gray, D. C. 2002. “Update Cosewic Status Report on the Woodland Caribou Rangifer Tarandus Caribou in Canada.” 41. https://doi.org/10.21105/joss.00355.

Tierney, Nicholas. 2017. “Visdat: Visualising Whole Data Frames.” JOSS 2 (16): 355. https://doi.org/10.21105/joss.00355.

Tierney, Nicholas, Di Cook, Miles McBain, and Colin Fay. 2020. Naniar: Data Structures, Summaries, and Visualisations for Missing Data. https://CRAN.R-project.org/package=naniar.

Wickham, Hadley. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.

Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.

Wickham, Hadley, Romain François, Lionel Henry, and Kirill Müller. 2020. Dplyr: A Grammar of Data Manipulation. https://CRAN.R-project.org/package=dplyr.

Yutani, Hiroaki. 2020. Gghighlight: Highlight Lines and Points in ’Ggplot2’. https://CRAN.R-project.org/package=gghighlight.

Zhu, Hao. 2019. KableExtra: Construct Complex Table with ’Kable’ and Pipe Syntax. https://CRAN.R-project.org/package=kableExtra.